Abstract

In this paper we present a nonlinear version of the well-known anomaly
detection method referred to as the RX-algorithm. Extending this algorithm
to a feature space associated with the original input space via a certain
nonlinear mapping function provides a nonlinear version of the
RX-algorithm. This nonlinear RX-algorithm, referred to as the kernel
RX-algorithm, is intractable in its direct form, mainly because of the high
dimensionality of the feature space produced by the nonlinear mapping
function. However, in this paper it is shown that the kernel RX-algorithm
can easily be implemented by kernelizing it in terms of kernels that
implicitly compute dot products in the feature space. Improved performance
of the kernel RX-algorithm over the conventional RX-algorithm is shown by
testing on several hyperspectral images for military target and mine
detection.

1  Introduction 

Anomaly detectors are pattern recognition schemes that are used to detect
objects that might be of military interest. Almost all anomaly detectors
attempt to locate anything that looks different spatially or spectrally
from its surroundings. In spectral anomaly detection algorithms, pixels
(materials) that have a significantly different spectral signature from
their neighboring background clutter pixels are identified as spectral
anomalies. Spectral anomaly detection algorithms [1-5] could also use
spectral signatures to detect anomalies embedded within background clutter
with a very low signal-to-noise ratio. In spectral anomaly detectors, no
prior knowledge of the target spectral signature is utilized or assumed.

Most of the detection algorithms in the literature [1, 5-7] assume that
hyperspectral imagery (HSI) data can be represented by the multivariate
normal (Gaussian) distribution; under the Gaussianity assumption, the
generalized likelihood ratio test (GLRT) is used to test the hypotheses and
decide whether a target exists in the image. The Gaussianity assumption has
been used mainly because of its mathematical tractability, which allows the
formation of widely used detection models such as the GLRT. However, in
reality the HSI data might not closely follow the Gaussian distribution.
Nevertheless, in various fields of signal processing, the GLRT is used to
detect signals (targets) of interest in noisy environments.

In this paper we formulate a nonlinear version of the RX-algorithm by
transforming each spectral pixel into a very high-dimensional (possibly
infinite-dimensional) feature space by a nonlinear mapping function. The
spectral pixel in the feature space now consists of possibly the original
spectral bands and nonlinear combinations of the spectral bands of the
original spectral signature. By implementing the RX-algorithm in the
feature space, the higher order correlations between spectral bands are
exploited, thus resulting in a nonlinear RX-algorithm. However, this
nonlinear RX-algorithm cannot be implemented directly because of the high
dimensionality of the feature space. It is shown in Section 4 that, because
the RX-algorithm consists of inner products of spectral vectors, it is
possible to implement a kernel-based nonlinear version of the RX-algorithm
by using kernel functions and their properties [8].

Kernel-based versions of a number of feature extraction or pattern
recognition algorithms have recently been proposed [9-14]. In [12], a
kernel version of principal component analysis (PCA) was proposed for
nonlinear feature extraction, and in [13] a nonlinear kernel version of the
Fisher discriminant analysis was implemented for pattern classification. In
[14], a kernel-based clustering algorithm was proposed, and in [10] kernels
were used as generalized dissimilarity measures for classification. Kernel
methods have also been applied to face recognition in [9].

This paper is organized as follows. Section 2 provides an introduction to
the RX-algorithm. Section 3 describes kernel functions and their
relationship with the dot product of input vectors in the feature space. In
Section 4 we show the derivation of the kernel version of the RX-algorithm.
Experimental results comparing the RX-algorithm and the kernel-based
RX-algorithm are given in Section 5. Finally, conclusions and discussion
are provided in Section 6.


2  Introduction to RX-Algorithm

Reed and Yu [6] developed a generalized likelihood ratio (GLR) test, the
so-called RX anomaly detector, for multidimensional image data, assuming
that the spectrum of the received signal (spectral pixel) and the
covariance of the background clutter are unknown. Let each input spectral
signal be denoted by a vector
$\mathbf{x}(n) = (x_1(n), x_2(n), \ldots, x_J(n))^T$ consisting of $J$
spectral bands. Define $\mathbf{X}_b$ to be a $J \times M$ matrix of the
$M$ reference background clutter pixels. Each observation spectral pixel is
represented as a column in the sample matrix $\mathbf{X}_b$:

\[
\mathbf{X}_b = [\mathbf{x}(1)\ \mathbf{x}(2)\ \ldots\ \mathbf{x}(M)].
\tag{1}
\]

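The two competing hypotheses that the RX-algorithm must distinguish are
given by

\[
\begin{aligned}
H_0 &: \mathbf{x} = \mathbf{n}, && \text{Target absent} \\
H_1 &: \mathbf{x} = a\mathbf{s} + \mathbf{n}, && \text{Target present}
\end{aligned}
\tag{2}
\]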

where $a = 0$ under $H_0$ and $a = 1$ under $H_1$, respectively,
$\mathbf{n}$ is a vector that represents the background clutter noise
process, and $\mathbf{s}$ is the spectral signature of the signal (target)
given by $\mathbf{s} = [s_1, s_2, \ldots, s_J]^T$. The target signature
$\mathbf{s}$ and background covariance $\mathbf{C}_b$ are assumed to be
unknown. The model assumes that the data arise from two normal PDFs with
the same covariance matrix but different means. Under $H_0$ the data
(background clutter) is modeled as $N(\mathbf{0}, \mathbf{C}_b)$ and under
$H_1$ it is modeled as $N(\mathbf{s}, \mathbf{C}_b)$. The background
covariance $\mathbf{C}_b$ is estimated from the reference background
clutter data. The estimated background covariance $\hat{\mathbf{C}}_b$ is
given by

\[
\hat{\mathbf{C}}_b = \frac{1}{M} \sum_{i=1}^{M}
(\mathbf{x}(i) - \hat{\boldsymbol{\mu}}_b)
(\mathbf{x}(i) - \hat{\boldsymbol{\mu}}_b)^T,
\tag{3}
\]

where $\hat{\boldsymbol{\mu}}_b$ is the estimated background clutter sample
mean given by

\[
\hat{\boldsymbol{\mu}}_b = \frac{1}{M} \sum_{i=1}^{M} \mathbf{x}(i).
\tag{4}
\]

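Assuming a single pixel target $\mathbf{r}$ as the observation test vector,
the expression for the RX-algorithm is given by

\[
RX(\mathbf{r}) = (\mathbf{r} - \hat{\boldsymbol{\mu}}_b)^T
\hat{\mathbf{C}}_b^{-1} (\mathbf{r} - \hat{\boldsymbol{\mu}}_b).
\tag{5}
\]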
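The RX statistic of Equations (3)-(5) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `rx_score`, the synthetic Gaussian background, and the use of a pseudoinverse in place of a plain inverse (to tolerate a singular covariance estimate) are assumptions of the example, not details taken from the paper.

```python
import numpy as np

def rx_score(r, X_b):
    """RX statistic of Equation (5) for a test pixel r (length J).

    X_b is a J x M matrix whose columns are background clutter pixels.
    """
    M = X_b.shape[1]
    mu_b = X_b.mean(axis=1)                  # sample mean, Equation (4)
    D = X_b - mu_b[:, None]                  # centered background pixels
    C_b = D @ D.T / M                        # sample covariance, Equation (3)
    d = r - mu_b
    # Assumption of this sketch: pinv instead of inv, in case C_b is singular.
    return float(d @ np.linalg.pinv(C_b) @ d)

rng = np.random.default_rng(0)
X_b = rng.normal(size=(5, 200))              # synthetic background: J=5, M=200
background_like = X_b[:, 0]                  # a pixel drawn from the background
anomaly = X_b.mean(axis=1) + 10.0            # a pixel far from the background mean
print(rx_score(background_like, X_b), rx_score(anomaly, X_b))
```

A pixel that coincides with the background mean scores zero, and the score grows with the Mahalanobis distance from that mean.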

3  Feature  Space  and  Kernel  Methods 

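Suppose the input hyperspectral data is represented by the data space
$\mathcal{X} \subset \mathcal{R}^J$ and let $\mathcal{F}$ be a feature
space associated with $\mathcal{X}$ by a nonlinear mapping function

\[
\Phi : \mathcal{X} \to \mathcal{F}, \qquad
\mathbf{x} \mapsto \Phi(\mathbf{x}),
\tag{6}
\]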

where $\mathbf{x}$ is an input vector in $\mathcal{X}$ which is mapped into
a potentially much higher dimensional feature space. The kernel trick
(Equation 7) allows us to implicitly compute the dot products in
$\mathcal{F}$ without mapping the input vectors into $\mathcal{F}$;
therefore, in the kernel methods, the mapping $\Phi$ does not need to be
identified. The kernel representation for the dot products in $\mathcal{F}$
is expressed as

\[
k(\mathbf{x}_i, \mathbf{x}_j)
= \langle \Phi(\mathbf{x}_i), \Phi(\mathbf{x}_j) \rangle
= \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j).
\tag{7}
\]

Equation 7 shows that the dot products in $\mathcal{F}$ can be avoided and
replaced by a kernel, a nonlinear function which can be easily calculated
without identifying the nonlinear map $\Phi$. Two commonly used kernels are
the Gaussian RBF kernel,
$k(\mathbf{x}, \mathbf{y}) = \exp\left(-\|\mathbf{x} - \mathbf{y}\|^2 / c\right)$,
and the polynomial kernel,
$k(\mathbf{x}, \mathbf{y}) = ((\mathbf{x} \cdot \mathbf{y}) + \theta)^d$.
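As a small illustration of Equation (7), the two kernels above can be written directly in NumPy, and for the homogeneous degree-2 polynomial kernel the kernel value can be checked against an explicit feature map. The function names and the explicit map `phi_quadratic` (all pairwise products of the input components) are assumptions introduced for this sketch; the default `c=40.0` follows the value reported in Section 5.

```python
import numpy as np

def rbf_kernel(x, y, c=40.0):
    """Gaussian RBF kernel exp(-||x - y||^2 / c)."""
    return float(np.exp(-np.sum((x - y) ** 2) / c))

def poly_kernel(x, y, theta=0.0, d=2):
    """Polynomial kernel ((x . y) + theta)^d."""
    return float((np.dot(x, y) + theta) ** d)

def phi_quadratic(x):
    """Explicit feature map for poly_kernel with theta=0, d=2:
    all products x_i * x_j, flattened into a vector of length J^2."""
    return np.outer(x, x).ravel()

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

implicit = poly_kernel(x, y)                       # kernel evaluated in input space
explicit = float(phi_quadratic(x) @ phi_quadratic(y))  # dot product in feature space
print(implicit, explicit)
```

Both numbers agree, which is the point of the kernel trick: the feature-space dot product is obtained without ever forming the mapped vectors. For the RBF kernel no finite-dimensional explicit map exists, so only the kernel form is available.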

4  Kernel  RX-Algorithm 

In this section, we remodel the RX-algorithm in the feature space by
assuming the input data has already been mapped into a high dimensional
feature space. The two hypotheses in the nonlinear domain are now

\[
\begin{aligned}
H_{0_\Phi} &: \Phi(\mathbf{x}) = \Phi(\mathbf{n}), && \text{Target absent} \\
H_{1_\Phi} &: \Phi(\mathbf{x}) = a_\Phi \Phi(\mathbf{s}) + \Phi(\mathbf{n}), && \text{Target present}
\end{aligned}
\tag{8}
\]

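The corresponding RX-algorithm in the feature space is

\[
RX(\Phi(\mathbf{r})) = (\Phi(\mathbf{r}) - \hat{\boldsymbol{\mu}}_{b_\Phi})^T
\hat{\mathbf{C}}_{b_\Phi}^{-1}
(\Phi(\mathbf{r}) - \hat{\boldsymbol{\mu}}_{b_\Phi}),
\tag{9}
\]

where $\hat{\mathbf{C}}_{b_\Phi}$ and $\hat{\boldsymbol{\mu}}_{b_\Phi}$ are
the estimated covariance and background clutter sample mean in the feature
space, respectively, given by

\[
\hat{\mathbf{C}}_{b_\Phi} = \frac{1}{M} \sum_{i=1}^{M}
(\Phi(\mathbf{x}(i)) - \hat{\boldsymbol{\mu}}_{b_\Phi})
(\Phi(\mathbf{x}(i)) - \hat{\boldsymbol{\mu}}_{b_\Phi})^T
\tag{10}
\]

and

\[
\hat{\boldsymbol{\mu}}_{b_\Phi} = \frac{1}{M} \sum_{i=1}^{M} \Phi(\mathbf{x}(i)).
\tag{11}
\]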

The nonlinear RX-algorithm given by Equation (9) is now expressed in the
feature space and cannot be implemented explicitly, because the nonlinear
mapping $\Phi$ produces a data space of high dimensionality. In order to
avoid implementing Equation (9) directly, we need to kernelize (9) by using
the kernel trick introduced in Section 3.

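The estimated background covariance matrix can be represented by its
eigenvector decomposition or spectral decomposition as given by

\[
\hat{\mathbf{C}}_{b_\Phi} = \mathbf{V}_\Phi \boldsymbol{\Lambda}_b \mathbf{V}_\Phi^T,
\tag{12}
\]

where $\boldsymbol{\Lambda}_b$ is a diagonal matrix consisting of the
eigenvalues and $\mathbf{V}_\Phi$ is a matrix whose columns are the
eigenvectors of $\hat{\mathbf{C}}_{b_\Phi}$ in the feature space. The
eigenvector matrix $\mathbf{V}_\Phi$ is given by

\[
\mathbf{V}_\Phi = [\mathbf{v}_\Phi^1, \mathbf{v}_\Phi^2, \ldots],
\tag{13}
\]

where $\mathbf{v}_\Phi^j$ is the $j$th eigenvector with non-zero
eigenvalue.

The pseudoinverse of the estimated background covariance matrix can also be
written as

\[
\hat{\mathbf{C}}_{b_\Phi}^{\#} = \mathbf{V}_\Phi \boldsymbol{\Lambda}_b^{-1} \mathbf{V}_\Phi^T.
\tag{14}
\]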

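Each eigenvector $\mathbf{v}_\Phi^j$ in the feature space can be expressed
as a linear combination of the centered input vectors
$\Phi_c(\mathbf{x}(i)) = \Phi(\mathbf{x}(i)) - \hat{\boldsymbol{\mu}}_{b_\Phi}$
in the feature space as shown by

\[
\mathbf{v}_\Phi^j = \sum_{i=1}^{M} \beta_i^j \Phi_c(\mathbf{x}(i))
= \mathbf{X}_{b_\Phi} \boldsymbol{\beta}^j,
\tag{15}
\]

where
$\mathbf{X}_{b_\Phi} = [\Phi_c(\mathbf{x}(1))\ \Phi_c(\mathbf{x}(2))\ \ldots\ \Phi_c(\mathbf{x}(M))]$,
and for all the eigenvectors

\[
\mathbf{V}_\Phi = \mathbf{X}_{b_\Phi} \mathbf{B},
\tag{16}
\]

where $\boldsymbol{\beta}^j = (\beta_1^j, \beta_2^j, \ldots, \beta_M^j)^T$
and $\mathbf{B} = (\boldsymbol{\beta}^1, \boldsymbol{\beta}^2, \ldots, \boldsymbol{\beta}^M)$
is the matrix whose columns $\boldsymbol{\beta}^j$ are shown in [12] to be
the eigenvectors of the kernel matrix (Gram matrix)
$\mathbf{K}(\mathbf{X}_b, \mathbf{X}_b)$ normalized by the square root of
their corresponding eigenvalues.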

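Substituting Equation (16) into (14) yields

\[
\hat{\mathbf{C}}_{b_\Phi}^{\#} = \mathbf{X}_{b_\Phi} \mathbf{B}
\boldsymbol{\Lambda}_b^{-1} \mathbf{B}^T \mathbf{X}_{b_\Phi}^T.
\tag{17}
\]

Inserting Equation (17) into (9), the nonlinear RX-algorithm can be
rewritten as

\[
RX(\Phi(\mathbf{r})) = (\Phi(\mathbf{r}) - \hat{\boldsymbol{\mu}}_{b_\Phi})^T
\mathbf{X}_{b_\Phi} \mathbf{B} \boldsymbol{\Lambda}_b^{-1} \mathbf{B}^T
\mathbf{X}_{b_\Phi}^T (\Phi(\mathbf{r}) - \hat{\boldsymbol{\mu}}_{b_\Phi}).
\tag{18}
\]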

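The dot product terms $\Phi(\mathbf{r})^T \mathbf{X}_{b_\Phi}$ in the
feature space can be represented in terms of the kernel function:

\[
\begin{aligned}
\Phi(\mathbf{r})^T \mathbf{X}_{b_\Phi}
&= \Phi(\mathbf{r})^T \Big( [\Phi(\mathbf{x}(1))\ \Phi(\mathbf{x}(2))\ \ldots\ \Phi(\mathbf{x}(M))]
 - \frac{1}{M} \sum_{i=1}^{M} \Phi(\mathbf{x}(i)) \Big) \\
&= \big( k(\mathbf{x}(1), \mathbf{r})\ k(\mathbf{x}(2), \mathbf{r})\ \ldots\ k(\mathbf{x}(M), \mathbf{r}) \big)
 - \frac{1}{M} \sum_{i=1}^{M} k(\mathbf{x}(i), \mathbf{r}) \\
&= \mathbf{k}(\mathbf{X}_b, \mathbf{r})^T - \frac{1}{M} \sum_{i=1}^{M} k(\mathbf{x}(i), \mathbf{r})
 \equiv \mathbf{K}_{\mathbf{r}}^T,
\end{aligned}
\tag{19}
\]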

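where $\mathbf{k}(\mathbf{X}_b, \mathbf{r})^T$ represents a vector whose
entries are the kernels $k(\mathbf{x}(i), \mathbf{r})$, $i = 1, \ldots, M$,
and $\frac{1}{M} \sum_{i=1}^{M} k(\mathbf{x}(i), \mathbf{r})$ represents
the scalar mean of $\mathbf{k}(\mathbf{X}_b, \mathbf{r})^T$, which is
subtracted from every entry. Similarly,

\[
\begin{aligned}
\hat{\boldsymbol{\mu}}_{b_\Phi}^T \mathbf{X}_{b_\Phi}
&= \frac{1}{M} \sum_{i=1}^{M} \Phi(\mathbf{x}(i))^T
 \Big( [\Phi(\mathbf{x}(1))\ \Phi(\mathbf{x}(2))\ \ldots\ \Phi(\mathbf{x}(M))]
 - \frac{1}{M} \sum_{j=1}^{M} \Phi(\mathbf{x}(j)) \Big) \\
&= \frac{1}{M} \sum_{i=1}^{M}
 \big( k(\mathbf{x}(i), \mathbf{x}(1))\ k(\mathbf{x}(i), \mathbf{x}(2))\ \ldots\ k(\mathbf{x}(i), \mathbf{x}(M)) \big)
 - \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} k(\mathbf{x}(i), \mathbf{x}(j)) \\
&= \frac{1}{M} \sum_{i=1}^{M} \mathbf{k}(\mathbf{x}(i), \mathbf{X}_b)
 - \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} k(\mathbf{x}(i), \mathbf{x}(j))
 \equiv \mathbf{K}_{\hat{\boldsymbol{\mu}}_b}^T.
\end{aligned}
\tag{20}
\]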
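The identities in Equations (19) and (20) can be checked numerically for a kernel whose feature map is known explicitly. The sketch below uses the homogeneous degree-2 polynomial kernel, whose explicit map (all pairwise products of the input components) is an assumption chosen for illustration; the variable names are likewise hypothetical.

```python
import numpy as np

def phi(X):
    """Explicit degree-2 feature map applied to each column of X (J x M)."""
    return np.stack([np.outer(x, x).ravel() for x in X.T], axis=1)

def gram(A, B):
    """Kernel values k(a_i, b_j) = (a_i . b_j)^2 between columns of A and B."""
    return (A.T @ B) ** 2

rng = np.random.default_rng(1)
X_b = rng.normal(size=(3, 8))            # J=3 bands, M=8 background pixels
r = rng.normal(size=3)                   # test pixel

# Explicit feature-space computation (only possible because phi is known).
F = phi(X_b)                             # mapped background, J^2 x M
mu = F.mean(axis=1)                      # feature-space mean, Equation (11)
F_c = F - mu[:, None]                    # centered background matrix
explicit_r = phi(r[:, None])[:, 0] @ F_c     # Phi(r)^T X_bPhi
explicit_mu = mu @ F_c                       # mu^T X_bPhi

# Kernel-only computation, with no reference to phi.
k_r = gram(X_b, r[:, None])[:, 0]        # entries k(x(i), r)
K = gram(X_b, X_b)                       # uncentered Gram matrix
K_r = k_r - k_r.mean()                   # right-hand side of Equation (19)
K_mu = K.mean(axis=0) - K.mean()         # right-hand side of Equation (20)

print(np.allclose(explicit_r, K_r), np.allclose(explicit_mu, K_mu))
```

The same kernel-only expressions apply to kernels such as the Gaussian RBF, where the explicit computation is not available.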


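Also, using the properties of kernel PCA [12], as shown in Appendix I, we
have the relationship

\[
\hat{\mathbf{K}}_b^{-1} = \frac{1}{M} \mathbf{B} \boldsymbol{\Lambda}_b^{-1} \mathbf{B}^T,
\tag{21}
\]

where we denote by
$\hat{\mathbf{K}}_b = \mathbf{K}(\mathbf{X}_b, \mathbf{X}_b) = (\mathbf{K})_{ij}$
the estimated centered $M \times M$ kernel (Gram) matrix whose entries
$k(\mathbf{x}_i, \mathbf{x}_j)$ are the dot products
$\langle \Phi_c(\mathbf{x}_i), \Phi_c(\mathbf{x}_j) \rangle$, and $M$ is
the total number of background clutter samples, a constant scale factor
that can be ignored. Substituting (19), (20), and (21) (without
$\frac{1}{M}$) into (18), the kernelized version of the RX-algorithm is
given by

\[
RX_K(\mathbf{r}) = (\mathbf{K}_{\mathbf{r}} - \mathbf{K}_{\hat{\boldsymbol{\mu}}_b})^T
\hat{\mathbf{K}}_b^{-1}
(\mathbf{K}_{\mathbf{r}} - \mathbf{K}_{\hat{\boldsymbol{\mu}}_b}),
\tag{22}
\]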

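which can now be implemented with no knowledge of the mapping function
$\Phi$. The only requirement is a good choice for the kernel function $k$.
Note that $\hat{\mathbf{K}}_b$ is the centered Gram matrix, as shown in
[8]. The centered $\hat{\mathbf{K}}_b$ is given by

\[
\hat{\mathbf{K}}_b = (\mathbf{K}_b - \mathbf{1}_N \mathbf{K}_b
- \mathbf{K}_b \mathbf{1}_N + \mathbf{1}_N \mathbf{K}_b \mathbf{1}_N),
\tag{23}
\]

where $\mathbf{K}_b$ is the Gram matrix before centering and
$\mathbf{1}_N$ is the $N \times N$ matrix with elements
$(\mathbf{1}_N)_{ij} = 1/N$; here $N$ equals the number of background
samples $M$.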
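Putting Equations (19), (20), (22), and (23) together, a kernel RX score can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the background is synthetic, and because the centered Gram matrix is rank deficient the inverse in Equation (22) is replaced here by a pseudoinverse with a relative cutoff `rcond` chosen arbitrarily.

```python
import numpy as np

def rbf_gram(A, B, c=40.0):
    """Gaussian RBF kernel values between columns of A (J x m) and B (J x n)."""
    sq = (A ** 2).sum(axis=0)[:, None] + (B ** 2).sum(axis=0)[None, :] - 2.0 * A.T @ B
    return np.exp(-np.maximum(sq, 0.0) / c)      # clip tiny negative round-off

def kernel_rx(r, X_b, c=40.0, rcond=1e-8):
    """Kernel RX statistic of Equation (22) for test pixel r and background X_b."""
    M = X_b.shape[1]
    K = rbf_gram(X_b, X_b, c)                    # uncentered Gram matrix
    one = np.full((M, M), 1.0 / M)
    K_hat = K - one @ K - K @ one + one @ K @ one    # centering, Equation (23)
    k_r = rbf_gram(X_b, r[:, None], c)[:, 0]
    K_r = k_r - k_r.mean()                       # Equation (19)
    K_mu = K.mean(axis=0) - K.mean()             # Equation (20)
    d = K_r - K_mu
    # Assumption of this sketch: pseudoinverse with cutoff rcond, since the
    # centered Gram matrix always has at least one zero eigenvalue.
    return float(d @ np.linalg.pinv(K_hat, rcond) @ d)

rng = np.random.default_rng(0)
X_b = rng.random((5, 40))                        # synthetic background in [0, 1]
test_pixel = rng.random(5) + 0.5                 # hypothetical test pixel
print(kernel_rx(test_pixel, X_b))
```

Only kernel evaluations between the test pixel and the background samples are needed, so the cost per pixel is governed by the number of background samples M rather than by the dimension of the feature space.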

5  Simulation  Results 

In this section, we apply both the kernel RX- and conventional
RX-algorithms to two HYDICE images, the Forest Radiance I (FR-I) image and
the Desert Radiance II (DR-II) image, and to the hyperspectral mine image,
as shown in Fig. 1. FR-I includes a total of 14 targets and DR-II contains
6 targets along the road; all the targets are military vehicles. The
hyperspectral mine image contains a total of 33 surface mines. A HYDICE
imaging sensor generates 210 bands across the whole spectral range
(0.4 - 2.5 μm), but we only use 150 bands by discarding water absorption
and low signal-to-noise ratio (SNR) bands; the bands used are the
23rd-101st, 109th-136th, and 152nd-194th. The hyperspectral mine image
consists of 70 bands whose spectral range spans 8-11.5 μm.

The Gaussian RBF kernel,
$k(\mathbf{x}, \mathbf{y}) = \exp\left(-\|\mathbf{x} - \mathbf{y}\|^2 / c\right)$,
was used to implement the kernel RX-algorithm; the value of $c$ was set to
40. All the pixel vectors in the test image are first normalized by a
constant, which is the maximum value obtained from all the spectral
components of the spectral vectors in the corresponding test image, so that
the entries of the normalized pixel vectors lie in the interval between
zero and one. The rescaling of pixel vectors was mainly performed to
effectively utilize the dynamic range of the Gaussian RBF kernel.
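The rescaling step can be sketched in one line of NumPy. The function name and the synthetic cube are assumptions of this example, and the sketch presumes non-negative sensor values so that the result falls in [0, 1].

```python
import numpy as np

def normalize_cube(image):
    """Divide every spectral component by the global maximum of the image."""
    return image / image.max()

rng = np.random.default_rng(0)
# Synthetic cube (rows x cols x bands) standing in for raw sensor counts.
cube = rng.integers(0, 4096, size=(8, 8, 5)).astype(float)
scaled = normalize_cube(cube)
print(scaled.min(), scaled.max())
```

Because a single constant is used for the whole image, the relative shape of each spectral signature is preserved.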

The kernel matrix $\hat{\mathbf{K}}_b$ can be estimated either globally or
locally. The global estimation must be performed prior to detection and
normally needs a large number of data samples to successfully represent all
the background types present in a given data set. In this paper, to
globally estimate $\hat{\mathbf{K}}_b$ we need to use all the spectral
vectors in a given test image. A well-known data clustering algorithm,
k-means [15], is applied to all the spectral vectors in order to generate a
significantly smaller number of spectral vectors (centroids) from which
$\hat{\mathbf{K}}_b$ is estimated. By using a small number of distinct
background spectral vectors, a manageable kernel matrix is generated, which
makes the kernel RX-algorithm more efficient to implement. The number of
representative spectral vectors (centroids) generated by the k-means
procedure was set to 600.
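The global background reduction can be sketched with a plain Lloyd-style k-means written in NumPy. The initialization (random pixels as starting centroids), the fixed iteration count, and the tiny synthetic cube are assumptions of this sketch; the paper states only that k-means [15] with 600 centroids was used.

```python
import numpy as np

def kmeans_centroids(pixels, n_clusters, n_iter=20, seed=0):
    """Lloyd's k-means on the rows of `pixels` (n_pixels x J); returns centroids."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(pixels), n_clusters, replace=False)
    centroids = pixels[start].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every pixel to every centroid.
        d2 = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members):                 # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return centroids

rng = np.random.default_rng(0)
image = rng.random((20, 20, 6))              # synthetic cube: rows x cols x bands
pixels = image.reshape(-1, image.shape[2])   # one spectral vector per row
centroids = kmeans_centroids(pixels, n_clusters=10)   # the paper uses 600
X_b = centroids.T                            # J x M background matrix for the Gram matrix
print(X_b.shape)
```

The resulting `X_b` plays the role of the background sample matrix from which the global kernel matrix is built once, before any pixel is tested.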

For local estimation of $\hat{\mathbf{K}}_b$ we use local background
samples, which are taken from the neighboring area of the pixel being
tested. For each test pixel location, a dual concentric rectangular window
is used to separate a local area into two regions, the inner-window region
(IWR) and the outer-window region (OWR), as shown in Fig. 2; the local
kernel matrix and the background covariance matrix are calculated from the
pixel vectors in the OWR. The test pixel vector $\mathbf{r}$ was obtained
from the IWR.

The dual concentric windows naturally divide the local area into the
potential target region (the IWR) and the background region (the OWR),
whose local statistics in the original and nonlinear feature domains are
compared using the conventional RX- and kernel RX-algorithms, respectively.
The size of the IWR is set to enclose the targets to be detected, whose
approximate size is based on prior knowledge of the range, field of view
(FOV), and the dimension of the biggest target in the given data set.
Similarly, the size of the OWR is set to include sufficient statistics from
the neighboring background. The sizes of the dual windows used were 5x5 and
13x13 pixel areas, respectively. The size of the OWR was set to include a
sufficient number of spectral vectors to generate the kernel matrix
$\hat{\mathbf{K}}_b$.
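The dual-window sampling can be sketched as follows. The function name, the clipping of the window at the image border, and the choice of the center pixel as the test vector are assumptions of this example; the 5x5 and 13x13 sizes are those stated above.

```python
import numpy as np

def outer_window_pixels(image, row, col, inner=5, outer=13):
    """Return the OWR pixels around (row, col) as a J x M matrix.

    The OWR is the outer x outer window minus the inner x inner window;
    the window is clipped at the image border (an assumption of this sketch).
    """
    h_in, h_out = inner // 2, outer // 2
    n_rows, n_cols, _ = image.shape
    r0, r1 = max(row - h_out, 0), min(row + h_out + 1, n_rows)
    c0, c1 = max(col - h_out, 0), min(col + h_out + 1, n_cols)
    rr, cc = np.meshgrid(np.arange(r0, r1), np.arange(c0, c1), indexing="ij")
    in_inner = (np.abs(rr - row) <= h_in) & (np.abs(cc - col) <= h_in)
    return image[rr[~in_inner], cc[~in_inner]].T

rng = np.random.default_rng(0)
image = rng.random((30, 30, 4))              # synthetic cube: rows x cols x bands
X_b_local = outer_window_pixels(image, 15, 15)
r = image[15, 15]                            # center pixel taken as the test vector
print(X_b_local.shape)
```

For a pixel away from the border this yields 13*13 - 5*5 = 144 background vectors, which then serve as the columns of the local background matrix for either detector.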

Figs. 3, 4, and 5 show the anomaly detection results of both the kernel RX
and the conventional RX using the local dual window applied to the FR-I and
DR-II images and the hyperspectral mine image, respectively. The kernel RX
detected most of the targets and mines with only a few false alarms, while
the conventional RX generated many more false alarms and missed some
targets; in particular, in the case of FR-I the conventional RX missed 7
successive targets from the left. For both the HYDICE images and the mine
image, the kernel RX showed significantly improved performance over the
conventional RX.

Figs. 6 and 7 show the ROC curves for the detection results for the FR-I
and DR-II images, as shown in Figs. 3 and 4, using the kernel RX and the
conventional RX based on the local dual window. Figs. 6 and 7 also include
the ROC curves for the kernel RX based on the global kernel matrix. The
global method for the kernel RX provided slightly improved performance over
the local method for the HYDICE images that were tested. Fig. 8 shows the
ROC curves for the detection results for the hyperspectral mine image, as
shown in Fig. 5, using the kernel RX and the conventional RX based on the
local dual window. Note that the kernel RX significantly outperformed the
conventional RX at lower false alarm rates.
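For readers who want to reproduce this kind of comparison, an ROC curve can be computed from a detector output map and a ground-truth mask as sketched below. The pixel-level definition of detections and false alarms used here is an assumption of this sketch; the paper does not state how the curves in Figs. 6-8 were scored, and the toy arrays are purely illustrative.

```python
import numpy as np

def roc_curve(scores, truth):
    """False-alarm rate and detection rate as the threshold sweeps the scores.

    scores: detector outputs (any shape); truth: same shape, nonzero = target.
    """
    order = np.argsort(-scores.ravel())          # highest scores first
    hits = truth.ravel()[order].astype(bool)
    tp = np.cumsum(hits)                         # targets detected so far
    fp = np.cumsum(~hits)                        # background pixels flagged so far
    return fp / max(fp[-1], 1), tp / max(tp[-1], 1)

scores = np.array([0.9, 0.8, 0.1, 0.2])          # toy detector outputs
truth = np.array([1, 1, 0, 0])                   # toy ground truth
far, pd = roc_curve(scores, truth)
print(far, pd)
```

Applying the same function to the score maps of two detectors on the same image gives curves that can be compared at a fixed false-alarm rate.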


6  Conclusions 

We have extended the RX-algorithm to a nonlinear feature space by
kernelizing the corresponding nonlinear GLRT expression. The GLRT
expression of the kernel RX is similar to that of the conventional RX, but
every term in the expression is in kernel form, which can be readily
calculated in terms of the input data in the original space. The kernel RX
showed superior detection performance over the conventional RX on the
HYDICE images tested. This is mainly because the high order correlations
between the spectral bands are exploited by the kernel RX.


Appendix  I 

In this appendix, the derivation of kernel PCA and its properties, which
provide the relationship between the covariance matrix and the
corresponding Gram matrix, is presented. Our goal here is to prove
expression (21). To derive kernel PCA, consider the estimated background
clutter covariance matrix in the feature space for the centered data
$\mathbf{X}_{b_\Phi} = [\Phi_c(\mathbf{x}_1)\ \Phi_c(\mathbf{x}_2)\ \ldots\ \Phi_c(\mathbf{x}_M)]$:

\[
\hat{\mathbf{C}}_{b_\Phi} = \frac{1}{M} \mathbf{X}_{b_\Phi} \mathbf{X}_{b_\Phi}^T.
\tag{24}
\]

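The PCA eigenvectors are computed by solving the eigenvalue problem

\[
\begin{aligned}
\lambda \mathbf{v}_\Phi &= \hat{\mathbf{C}}_{b_\Phi} \mathbf{v}_\Phi \\
&= \frac{1}{M} \sum_{i=1}^{M} \Phi_c(\mathbf{x}_i) \Phi_c(\mathbf{x}_i)^T \mathbf{v}_\Phi \\
&= \frac{1}{M} \sum_{i=1}^{M} \langle \Phi_c(\mathbf{x}_i), \mathbf{v}_\Phi \rangle \Phi_c(\mathbf{x}_i),
\end{aligned}
\tag{25}
\]

where $\mathbf{v}_\Phi$ is an eigenvector in $\mathcal{F}$ with a
corresponding nonzero eigenvalue $\lambda$. Equation (25) indicates that
any eigenvector $\mathbf{v}_\Phi$ with corresponding $\lambda \neq 0$ is
spanned by the input data
$\Phi_c(\mathbf{x}_1), \ldots, \Phi_c(\mathbf{x}_M)$, i.e.

\[
\mathbf{v}_\Phi = \sum_{i=1}^{M} \beta_i \Phi_c(\mathbf{x}_i)
= \mathbf{X}_{b_\Phi} \boldsymbol{\beta},
\tag{26}
\]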

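where $\boldsymbol{\beta} = (\beta_1, \beta_2, \ldots, \beta_M)^T$.
Substituting (26) into (25) and multiplying with
$\Phi_c(\mathbf{x}_n)^T$, $n = 1, \ldots, M$, yields

\[
\begin{aligned}
\lambda \sum_{i=1}^{M} \beta_i \langle \Phi_c(\mathbf{x}_n), \Phi_c(\mathbf{x}_i) \rangle
&= \frac{1}{M} \sum_{i=1}^{M} \beta_i \Phi_c(\mathbf{x}_n)^T
 \sum_{j=1}^{M} \Phi_c(\mathbf{x}_j) \Phi_c(\mathbf{x}_j)^T \Phi_c(\mathbf{x}_i) \\
&= \frac{1}{M} \sum_{i=1}^{M} \beta_i
 \Big\langle \Phi_c(\mathbf{x}_n), \sum_{j=1}^{M} \Phi_c(\mathbf{x}_j)
 \langle \Phi_c(\mathbf{x}_j), \Phi_c(\mathbf{x}_i) \rangle \Big\rangle,
\end{aligned}
\tag{27}
\]

for all $n = 1, \ldots, M$.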

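We denote by
$\hat{\mathbf{K}}_b = \mathbf{K}(\mathbf{X}_b, \mathbf{X}_b) = (\mathbf{K})_{ij}$
the $M \times M$ kernel (Gram) matrix whose entries are the dot products
$\langle \Phi_c(\mathbf{x}_i), \Phi_c(\mathbf{x}_j) \rangle$. Equation (27)
can now be rewritten as

\[
M \lambda \boldsymbol{\beta} = \hat{\mathbf{K}}_b \boldsymbol{\beta},
\tag{28}
\]

where the vectors $\boldsymbol{\beta}$ turn out to be the eigenvectors with
nonzero eigenvalues of the kernel matrix $\hat{\mathbf{K}}_b$, as shown in
[12]. Note that each $\boldsymbol{\beta}$ needs to be normalized by the
square root of its corresponding eigenvalue.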

Furthermore, we assumed that the data was centered in the feature space;
however, we cannot center the data in the high dimensional feature space
because we do not have any knowledge of the nonlinear mapping $\Phi$.
Therefore, we have to start with the original uncentered data, and the
resulting Gram matrix $\mathbf{K}_b$ needs to be properly centered. As
shown in [12], the centered Gram matrix $\hat{\mathbf{K}}_b$ can be
obtained from the uncentered Gram matrix $\mathbf{K}_b$ by

\[
\hat{\mathbf{K}}_b = (\mathbf{K}_b - \mathbf{1}_M \mathbf{K}_b
- \mathbf{K}_b \mathbf{1}_M + \mathbf{1}_M \mathbf{K}_b \mathbf{1}_M),
\tag{29}
\]


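where $\mathbf{1}_M$ is an $M \times M$ matrix with entries
$(\mathbf{1}_M)_{ij} = 1/M$. From the definition of PCA in the feature
space (25) and the kernel PCA (28), we can now write the eigenvector
decomposition of the background covariance matrix and Gram matrix as

\[
\hat{\mathbf{C}}_{b_\Phi} = \mathbf{V}_\Phi \boldsymbol{\Lambda}_b \mathbf{V}_\Phi^T
\tag{30}
\]

and

\[
\hat{\mathbf{K}}_b = \mathbf{B} \boldsymbol{\Omega}_{K_b} \mathbf{B}^T,
\tag{31}
\]

respectively. Using pseudoinverse matrix properties [16], the pseudoinverse
background covariance matrix $\hat{\mathbf{C}}_{b_\Phi}^{\#}$ and inverse
Gram matrix $\hat{\mathbf{K}}_b^{-1}$ can also be written as

\[
\hat{\mathbf{C}}_{b_\Phi}^{\#} = \mathbf{V}_\Phi \boldsymbol{\Lambda}_b^{-1} \mathbf{V}_\Phi^T
\tag{32}
\]

and

\[
\hat{\mathbf{K}}_b^{-1} = \mathbf{B} \boldsymbol{\Omega}_{K_b}^{-1} \mathbf{B}^T,
\tag{33}
\]

respectively. From the relationship between the eigenvalues of the
covariance matrix in the feature space and the Gram matrix described in
(28),

\[
\boldsymbol{\Lambda}_b = \frac{1}{M} \boldsymbol{\Omega}_{K_b},
\tag{34}
\]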

where $\boldsymbol{\Lambda}_b$ is a diagonal matrix with its diagonal
elements being the eigenvalues of $\hat{\mathbf{C}}_{b_\Phi}$ and
$\boldsymbol{\Omega}_{K_b}$ is a diagonal matrix with diagonal values equal
to the eigenvalues of the Gram matrix $\hat{\mathbf{K}}_b$. Substituting
(34) into (33) we obtain the relationship

\[
\hat{\mathbf{K}}_b^{-1} = \frac{1}{M} \mathbf{B} \boldsymbol{\Lambda}_b^{-1} \mathbf{B}^T,
\tag{35}
\]

where $M$ is a constant representing the total number of background clutter
samples, which can be ignored.
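The eigenvalue relationship between the covariance matrix and the centered Gram matrix, together with the centering formula (29), can be checked numerically. The sketch below uses the linear kernel, for which the feature map is the identity; that choice, the synthetic data, and the variable names are assumptions made only so that both sides can be computed explicitly.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 10))                 # J=4 bands, M=10 background samples
J, M = X.shape

# Explicit side: center the data and form the covariance of Equation (24).
X_c = X - X.mean(axis=1, keepdims=True)
C = X_c @ X_c.T / M

# Kernel side: uncentered Gram matrix (linear kernel), centered via Equation (29).
K = X.T @ X
one_M = np.full((M, M), 1.0 / M)
K_hat = K - one_M @ K - K @ one_M + one_M @ K @ one_M

# Nonzero eigenvalues, largest first.
lam = np.sort(np.linalg.eigvalsh(C))[::-1]
omega = np.sort(np.linalg.eigvalsh(K_hat))[::-1][:J]

print(np.allclose(K_hat, X_c.T @ X_c), np.allclose(lam, omega / M))
```

Both checks print `True`: centering the Gram matrix is equivalent to centering the data before taking dot products, and the nonzero eigenvalues of the covariance matrix are those of the centered Gram matrix divided by M.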
Figure 1: Sample band images (48th) from the HYDICE images and the mine
image. (a) The Forest Radiance I image, (b) the Desert Radiance II image,
and (c) the hyperspectral mine image.


Figure 2: Example of the dual concentric windows in the hyperspectral
images.


Figure 3: Detection results for the Forest Radiance I image using the
kernel RX-algorithm and conventional RX-algorithm based on the local dual
window. (a) Kernel RX, (b) 3-D plot of (a), (c) RX, and (d) 3-D plot of
(c).


Figure 4: Detection results for the Desert Radiance II image using the
kernel RX-algorithm and conventional RX-algorithm based on the local dual
window. (a) Kernel RX, (b) 3-D plot of (a), (c) RX, and (d) 3-D plot of
(c).


Figure 5: Detection results for the mine image using the kernel
RX-algorithm and conventional RX-algorithm based on the local dual window.
(a) Kernel RX, (b) 3-D plot of (a), (c) RX, and (d) 3-D plot of (c).


Figure 6: ROC curves obtained by the kernel RX-algorithm based on the
global and local kernel matrices and the conventional RX-algorithm based on
the local covariance matrix for the Forest Radiance I image.


Figure 7: ROC curves obtained by the kernel RX-algorithm based on the
global and local kernel matrices and the conventional RX-algorithm based on
the local covariance matrix for the Desert Radiance II image.


Figure 8: ROC curves obtained by the kernel RX-algorithm and the
conventional RX-algorithm based on the local dual window for the
hyperspectral mine image.


